The fundamental 'plasticity' of the nervous system (i.e., high adaptability at different structural levels) 
is primarily based on Hebbian learning mechanisms that modify the synaptic connections. The modifi- 
cations rely on neural activity and assign a special dynamic behavior to the neural networks. Another 
striking feature of the nervous system is that spike based information transmission, which is supposed 
to be robust against noise, is noisy in itself: the variance of the spiking of the individual neurons is 
surprisingly large, which may impair the adequate functioning of the Hebbian mechanisms. In this 
paper we focus on networks in which Hebbian-like adaptation is induced only by external random noise 
and study spike-timing dependent synaptic plasticity. We show that such 'HebbNets' are able to develop 
a broad range of network structures, including scale-free small-world networks. The development of 
such network structures may provide an explanation of the role of noise and its interplay with Hebbian 
plasticity. We also argue that this model can be seen as a unification of the famous Watts-Strogatz and 
preferential attachment models of small-world nets. 
1. Introduction 

In the last few years spike-timing dependent synaptic plasticity (STDP) (see, e.g., Ref. 1 and references therein), which is an extension of the classical Hebbian learning mechanism, has been the subject of intensive research. Recent experiments 2,3,4 (for a review, see, e.g., Ref. 5) revealed that exact timing 
and temporal dynamics of the neural activities play 
a crucial role in forming the neuronal basis of plasticity. While it is still an open question whether the rate of spikes (that is, the temporally or population-averaged spike count) or the exact time pattern of the spikes carries the information, it is broadly accepted in the machine learning literature 6,7,8 and is strongly supported in neuronal modelling 9 that spike-based encoding can be efficient in compression, allows for sparse representation and low energy consumption, and 
that it can be robust against noise. The last property seems to be indispensable given the stochastic behavior of the neurons and of the external environment. But if noise should be suppressed, why is a great part of the signals propagating through several brain regions, as observed in different species (ranging from frogs to primates), considered to be internally generated noise 10,11? What 
can be the reason for counteracting the perfect in- 
formation processing and transmission? One possi- 
ble role of noise in the nervous system is provided 
by the recognition that noise can enhance the re- 
sponse of nonlinear systems to weak signals, via a 
mechanism known as stochastic resonance (see, e.g., 
(Ref. 12)). However, noisy functioning may have 
additional roles. For example, it has been shown 
that synaptic background activity may help to distinguish very similar inputs 13. It has also been demonstrated 14 that strict conditions on the stability of Hebbian mechanisms can be relaxed by introducing 
random external noise instead of maintaining com- 
petition among neurons over the input sets. In this 
paper we address the question whether noise may 
have any impact on structural changes. 

In the following, we examine what network struc- 
tures may emerge in a simplistic neural system by 
applying pure Hebbian learning. From now on, this neuronal network model will be referred to as HebbNet. 

2. Description of HebbNet 

We assume that the network is sustained by inputs with no spatio-temporal structure; that is, the input is random noise. Our models consist of N simplified integrate-and-fire-like 'neurons' or nodes. The dynamics of the internal activity is written as 

$$\frac{\Delta a_i}{\Delta t} = \sum_j w_{ij}\, a_j^s + x_i^{(ext)} \tag{1}$$

for i = 1, 2, . . . , N. (N was 200 in our simulations.) 
Variable $x_i^{(ext)} \in (0, 1)$ denotes the randomly generated input from the environment, $a_i$ is the internal activity of neuron i, and $w_{ij}$ is the ij-th element of matrix W, i.e., the connection strength from neuron j to neuron i. If $\Delta t = 1$ then we have a discrete-time network and each parameter has a time index; if $\Delta t$ is infinitesimally small then Eq. 1 becomes a set of coupled differential equations. Neuron j outputs a spike (neuron j 'fires') when $a_j$ exceeds a certain level, the threshold parameter $\theta$. Spiking means that the output of the neuron, $a_j^s$ (superscript s stands for 'spiking'), is set to 1. After firing, $a_j$ is set to zero at the next time step for the discrete-time network. For the continuous version of Eq. 1, $a_j$ is set to zero after a very small time interval. The amount of excitation received by neuron i from neuron j is $w_{ij} a_j^s$. Equation 
1 describes the simplest form of 'integrate-and-fire' network models that is still plausible from a neurobiological point of view. Note that if $\Delta t = 1$ and the threshold is set to zero (i.e., if a neuron receives any excitation then it fires and is reset to zero) then Eq. 1 represents 'binary neurons' without temporal integration. This can be seen as the simplest model within our framework. Also, if the threshold is kept and if $a_i$ is set to zero before each time step, irrespective of whether the i-th neuron fires or not, then the original model of McCulloch and Pitts 15 is recovered. 
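
The discrete-time case of the dynamics described above can be illustrated with a minimal sketch. It is not the simulation code used for the results; the function name `step`, the argument names, and the choice of uniform noise for every neuron are assumptions made for illustration. The sketch also assumes that a neuron that fired is reset to zero before it integrates the next input.

```python
import random


def step(a, spikes, w, theta, rng):
    """One discrete-time update (Delta t = 1) of integrate-and-fire-like nodes.

    a      : list of internal activities a_i
    spikes : list of 0/1 spiking outputs a_j^s from the previous step
    w      : w[i][j] is the connection strength from neuron j to neuron i
    theta  : firing threshold
    rng    : random.Random instance supplying the external noise
    """
    n = len(a)
    new_a = []
    for i in range(n):
        # Assumption: a neuron that fired is reset to zero before integrating.
        base = 0.0 if spikes[i] else a[i]
        # Recurrent excitation: sum over j of w_ij * a_j^s.
        recurrent = sum(w[i][j] * spikes[j] for j in range(n))
        # Assumption: external input is uniform noise in [0, 1).
        x_ext = rng.random()
        new_a.append(base + recurrent + x_ext)
    # A neuron fires when its activity exceeds the threshold.
    new_spikes = [1 if v > theta else 0 for v in new_a]
    return new_a, new_spikes
```

Calling `step` repeatedly with the returned activities and spikes advances the network; the weights `w` are held fixed in this sketch.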

Beyond the local activity threshold, we also ex- 
amined the effect of global activity constraint: at 
each time instant, a given percentage of the nodes was selected randomly in proportion to the activity $a_i$ for all i = 1, 2, ..., N. These neurons fired at that time 
instant. For these two cases, computer simulations 
showed negligible differences. 

Synaptic strengths are modified as follows: 

$$\frac{\Delta w_{ij}}{\Delta t} = \sum_{t_i, t_j} K(t_j - t_i)\, a_i^{t_i,s}\, a_j^{t_j,s}, \tag{2}$$

where K is a kernel function which defines the influence of the temporal activity correlation on synaptic efficacy, $t_i$ and $t_j$ are the spiking times of neurons i and j, respectively, and $a_i^{t_i,s}$ is the firing activity of neuron i at time $t_i$. $\Delta w_{ij} / \Delta t$ may be taken over discrete or over infinitesimally small time intervals. Possible kernels are depicted in Fig. 1. The kernel is a function of 
the time differences. Because, in our case, the input 
is noise with no temporal correlation, only the ratio of the positive (strengthening) and the negative (weakening) areas of the kernel function ($r_{A^+/A^-}$) should count. Temporal grouping and reshaping of 
the kernel would not modify our results as long as 
the aforementioned ratio is kept constant and the 
input is pure noise. For this special case, the dif- 
ference between the two kernel types of Fig. 1 does 
not have much impact on the temporal evolution of 
our model network. It should be noted that when inputs with spatiotemporal structure and other known details of synaptic plasticity mechanisms are included, this kernel-shape independence will not hold. Our only constraint on the kernel, namely $r_{A^+/A^-} < 1$, is required to constrain the weights. This constraint redistributes weight strengths. Empirical data indicate that there are indeed mechanisms to redistribute weight strengths; potentiation is favored for weak synapses, whereas strong synapses tend to be depressed (see, e.g., Refs. 14, 16, 17). 
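
As an illustration of the role of the area ratio, the following sketch defines a step-shaped kernel and computes its ratio of strengthening to weakening area. The kernel shape, the names `kernel`, `area_ratio` and `weight_change`, and the parameter values are hypothetical. The sketch also assumes that potentiation occurs when neuron j fires before neuron i, which is a common STDP convention and not a detail taken from Fig. 1.

```python
def kernel(dt, a_plus=0.1, a_minus=1.0, width=3):
    """Hypothetical step-shaped kernel K(dt), with dt = t_j - t_i in time steps."""
    if -width <= dt < 0:
        # Assumption: j fires before i, so the connection j -> i is strengthened.
        return a_plus / width
    if 0 < dt <= width:
        # j fires after i: the connection is weakened.
        return -a_minus / width
    return 0.0


def area_ratio(k, max_lag=50):
    """Ratio of the positive to the negative area of kernel k over integer lags."""
    values = [k(d) for d in range(-max_lag, max_lag + 1)]
    positive = sum(v for v in values if v > 0)
    negative = -sum(v for v in values if v < 0)
    return positive / negative


def weight_change(spike_times_i, spike_times_j, k):
    """Sum of K(t_j - t_i) over all spike pairs (spiking outputs equal 1)."""
    return sum(k(t_j - t_i) for t_i in spike_times_i for t_j in spike_times_j)
```

With the default values the area ratio is approximately 0.1, so depression outweighs potentiation when spike pairs are uncorrelated.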



Fig. 1. Kernel functions (axes: kernel amplitude vs. time difference) 

Two temporal kernels as a function of the time difference between the spiking times of neurons i and j ($t_i - t_j$). The relevant shape parameter for noise-sustained systems is the ratio $r_{A^+/A^-} = A^+/A^-$ of the areas (sums of the positive and negative parts/components) of the kernel, $A^+$ and $A^-$, respectively. 

In the first place, we have been interested in 
the emerging local and global connectivity structure 
of W. As the network of the connections can be 
best described by a weighted graph, from now on 
'nodes' stand for the neurons, while 'edges' or 'di- 
rected edges' denote the connections among them. 
An insightful way of characterizing graphs has been 
proposed by Watts and Strogatz. They computed 
the characteristic path length (L), which is the average number of edges on the shortest path between pairs of nodes in the network. They also computed the clustering coefficient (C), which is large if the average local connectivity is large. For more details, see Ref. 18. 

In this study, we applied the so-called connectivity length measure based on the concept of network efficiency 19. This measure is more appropriate for weighted networks 20, is equally well applicable for describing global and local properties, and offers a unified theoretical background to characterize our system. According to the definition 20,21, the efficiency between nodes i and j in a weighted network with connectivity matrix W is $\epsilon_{ij} = 1/d_{ij}$, where $d_{ij}$ corresponds to the shortest path length over all of the possible paths from neuron j to i, and where the path length between each connected pair of vertices is the inverse of the weight between them. For graphs with connection strengths of values 0 or 1, $d_{ij}$ corresponds to the shortest distance between nodes i and j. The average of these values, $\frac{1}{N(N-1)} \sum_{i \neq j} \epsilon_{ij}$, characterizes the efficiency of the whole network. The local harmonic mean distance for node i is defined as 

$$D_h^l(i) = \frac{n^{(i)} \left(n^{(i)} - 1\right)}{\sum_{k \neq j;\; k, j \in G^{(i)}} \epsilon_{kj}^{(i)}}, \tag{3}$$

where $n^{(i)}$ is the number of neurons in subgraph $G^{(i)}$, subgraph $G^{(i)}$ consists of all nodes l around neuron i with $w_{il} > 0$, and $\epsilon_{kj}^{(i)}$ is the inverse of the shortest distance between nodes k and j in $G^{(i)}$. $N > n^{(i)}$ arises when weights may become zero. In terms of efficiency, the inverse of this value describes how good the local communication is among the first neighbors of node i with node i removed. That is why this measure can also be regarded as the local connectivity length. It is a measure of the fault tolerance of the system. 
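The mean global distance in the network is defined by the following quantity, with $\epsilon_{ij} = 1/d_{ij}$ as before: 

$$D_h^g = \frac{N(N-1)}{\sum_{i \neq j} \epsilon_{ij}}. \tag{4}$$
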
Global distance provides a measure for the size (or 
the diameter) of the network, which influences the 
average time of information transfer. That is why, 
its inverse is used as the (un-normalized) global efficiency. According to the literature 20,21, the local harmonic mean distance measure behaves like 1/C (the inverse of the clustering coefficient), whereas the global 
value is a good approximation of L under certain con- 
ditions. 
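
The connectivity length measures can be sketched in a few lines. The sketch below assumes the harmonic-mean form, i.e., n(n-1) divided by the sum of inverse shortest distances, with the length of an edge taken as the inverse of its weight. The function names and the use of the Floyd-Warshall algorithm are illustrative choices, not details taken from the study.

```python
import math


def shortest_paths(w, nodes):
    """Floyd-Warshall restricted to `nodes`.

    w[i][j] is the weight of the connection from j to i, and the length of
    that edge is 1 / w[i][j]. d[(i, j)] is the shortest distance from j to i.
    """
    d = {}
    for i in nodes:
        for j in nodes:
            if i == j:
                d[(i, j)] = 0.0
            elif w[i][j] > 0:
                d[(i, j)] = 1.0 / w[i][j]
            else:
                d[(i, j)] = math.inf
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[(i, k)] + d[(k, j)] < d[(i, j)]:
                    d[(i, j)] = d[(i, k)] + d[(k, j)]
    return d


def harmonic_mean_distance(w, nodes):
    """n(n-1) divided by the sum of inverse shortest distances among `nodes`."""
    nodes = list(nodes)
    n = len(nodes)
    if n < 2:
        return math.inf
    d = shortest_paths(w, nodes)
    efficiency = sum(1.0 / d[(i, j)] for i in nodes for j in nodes if i != j)
    return n * (n - 1) / efficiency if efficiency > 0 else math.inf


def local_harmonic_mean_distance(w, i):
    """Harmonic mean distance among the neighbours of node i, with i removed."""
    neighbours = [l for l in range(len(w)) if l != i and w[i][l] > 0]
    return harmonic_mean_distance(w, neighbours)
```

The global value is obtained by calling `harmonic_mean_distance(w, range(len(w)))`. A node whose neighbours are not connected to each other gets an infinite local distance in this sketch.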

Many different networks belong to the same 
structural family regarded as 'small-worlds'. Their 
most characteristic feature is that they are efficient 
locally and globally, too. While local and global con- 
nectedness are useful tools to characterize a network 
architecture, it is worth investigating the degree dis- 
tributions of the incoming and outgoing connections 
as well 22 . They may provide information about the 
scaling of different properties of the given structure, 
like the change of the diameter as a function of the 
number of nodes. One particular subfamily of small-world nets can also be characterized as 'scale-free' networks, because their most significant properties scale according to a power law with the connection number distribution. Most scale-free nets are also small-worlds, provided that the connections are not too sparse and essentially no part of the network is isolated. 

3. Results 




Fig. 2. Log-log plots for different parameters 

The four diagrams display typical distributions for parameters (a): $r_{A^+/A^-} = 0.1$, $r_{ex} = 0.3$; (b): $r_{A^+/A^-} = 0.1$, $r_{ex} = 0.6$; (c): $r_{A^+/A^-} = 0.6$, $r_{ex} = 0.3$; and (d): $r_{A^+/A^-} = 0.6$, $r_{ex} = 0.75$. Cases (a) and (d) are arbitrary examples from the power-law region. 



Figure 2 summarizes our findings in different parameter regions. The figure displays the emergence of scale-free nets as a function of the excitation level $r_{ex}$, the average ratio of neurons receiving excitation from the environment, and the ratio of the area of potentiation to the area of depression ($r_{A^+/A^-}$) 
in kernel K. The length of the scale-free regions 
was determined by first plotting the distribution of 
the sum of the weights of outgoing connections (av- 
eraged over 20 runs, each run contains 10000 sam- 
ples) for every parameter set studied. Results are 
depicted on a log-log plot. Supposing a power-law distribution ($P(k^*) \approx k^{*\gamma} e^{-k^*/\xi}$, where $k^*$ denotes the discretized values of the connection strength), a linear fit was made to approximate $\gamma$. The width of the scale-free region was estimated by the length of the region with power-law distribution relative to the full length covered on the log scale. The maximum error of the linear fit was set to $10^{-3}$ STD. That is, for 100 discretization points, the width of a region spreading an order of magnitude on the log-log plot is equal to 0.5. Figure 3 shows the corresponding connection matrices. While case (c) resembles a random structure, case (b) seems to be a winners-take-most network, in which only a few neurons dominate the total amount of connection strength. However, cases (a) and (d) show strong clustering in a rather sparse structure and therefore correspond to scale-free small-world networks characterized by their $\gamma$ values (-1.66 and -1.63, respectively). 
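
The linear fit on the log-log plot can be sketched as an ordinary least-squares regression of log P against log k. This is a generic illustration; the function name is hypothetical, and the sketch does not reproduce the error criterion or the estimation of the width of the scale-free region described above.

```python
import math


def fit_power_law_exponent(k_values, p_values):
    """Least-squares slope and intercept of log10(P) versus log10(k).

    Points with a non-positive k or P are skipped because their logarithm
    is undefined.
    """
    points = [
        (math.log10(k), math.log10(p))
        for k, p in zip(k_values, p_values)
        if k > 0 and p > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    s_xx = sum((x - mean_x) ** 2 for x, _ in points)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For data that follow a pure power law, the returned slope is the exponent; an exponential cut-off would show up as a systematic deviation from the fitted line at large k.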
Fig. 3. Connectivity matrices 

The four diagrams display connectivity matrices cor- 
responding to the cases in Fig. 2. Cases (a) and (d) are 
arbitrary examples from the power law region. 

With the help of the connectivity length measures introduced above, we also studied the emerging network structures as a function of the following parameters: (i) the magnitude of the external excitation and (ii) the strengthening-to-weakening area ratio ($r_{A^+/A^-}$) of kernel K. It can be seen that many connection weights have vanished, which makes it possible to talk about 'subgraphs' with local connectivity. As an extreme case of the general model, the binary neuron model was also investigated and no important differences were found. 

We compared the resulting HebbNet structures 
with a random net, in which the same weights of 
the dynamic network have been randomly assigned 
to different node pairs. Fig. 4 displays the emerging connections of a HebbNet for two different parameter sets. Figure 4 clearly highlights the emerging small-world properties, i.e., small local connectivity length values (high clustering coefficients) for case (d). Although the global connectivity length was almost the same for all HebbNets and their corresponding random nets, local distances are much smaller in case (d). That is, the connectivity structure is sparse but information flow is still fault tolerant and efficient. 
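
The random reference net used for comparison can be sketched as a reshuffling of the existing weights among node pairs. The function name is hypothetical, and the exclusion of self-connections is an assumption of this sketch.

```python
import random


def shuffle_weights(w, rng):
    """Reassign the off-diagonal weights of w to randomly chosen node pairs.

    The multiset of weights is preserved; only their placement changes.
    Assumption: self-connections are excluded and left at zero.
    """
    n = len(w)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    values = [w[i][j] for i, j in pairs]
    rng.shuffle(values)
    shuffled = [[0.0] * n for _ in range(n)]
    for (i, j), value in zip(pairs, values):
        shuffled[i][j] = value
    return shuffled
```

Because the weight distribution is identical in the two nets, any difference in local connectivity length reflects the placement of the weights and not their magnitudes.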
Fig. 4. Local connectivity length distances 

Local connectivity length distances are shown in ascending order. For better visualization, not all data points are marked and the points are connected with a solid line. Lines with upward triangle markers: STDP learning. Lines with circles: the same but with randomly redistributed weights. Lines with empty (solid) markers: HebbNet of case (c) (case (d)). Global harmonic mean distances for the original and for the randomized networks in case (c) of Fig. 3 (case (d) of Fig. 3) are about the same, approximately 5.5 (approximately 10). 
The robustness of the network to the external excitation (i.e., the amount of noise input to the network) is illustrated in Fig. 5. 
average local distances for the evolving network. Circles: average local distances for the corresponding random net. 



By increasing the excitation level, the average local connectivity length of the random net increases drastically, whereas the efficiency of the small-world network shows only weak dependence in the same region. For the network with parameter $r_{A^+/A^-} = 0.1$ (Fig. 5(A)), there is a sharp cut-off around excitation level 0.55, where local distances suddenly drop, due to the high ratio of excitation. Qualitatively similar behavior can be seen for $r_{A^+/A^-} = 0.6$ (Fig. 5(B)), but the cut-off is around $r_{ex} = 0.9$. 

Results demonstrated so far characterize the 'early' stages of network development, as the interaction among neurons is weak due to the low connection weight values in all of the above examples. Figure 6 demonstrates that even in the case of strong interaction, the structural properties found are present: according to the figure, the power-law behavior is present in a broad range of parameters. For the constant parameter of Fig. 6 (i.e., for $r_{A^+/A^-} = 0.1$) we observed a convergence of the exponent of the power-law distribution to -1. 




Fig. 5. Average local distance vs. excitation ratio 



Fig. 6. Power-law with significant interaction 

Left: exponent of the power law; right: ratio of the power-law domain (i.e., ratio of the width of the power-law distribution region relative to the full length covered on the log scale) as a function of $r_{ex}$ and excitation threshold $\theta$. Parameter $r_{A^+/A^-}$ equals 0.1. Results are averaged over 700 steps. Input from other neurons could exceed the external inputs by a factor of 10. The power-law exponent is about -1 for broad regions of $\theta$ and $r_{ex}$. Outside these regions the network may vanish or start oscillating. 
4. Discussion and outlook 

One of the most exciting findings in recent scientific research is that many complex interactive systems possess a surprising structural and functional property: the emergence of scale-free small-world networks (SFNs) of the building blocks. Such SFNs may be found in distinct fields ranging from metabolic reaction chains to social relation systems 18,23,24,25,26,20,27. One may find SFNs in neurobiology as well. For example, the only completely mapped neural network, that of the nematode worm C. elegans 29, is considered to form a small-world network 20. An outstanding example is the Internet, which displays this network structure at the hardware level of servers and also at the level of web pages 25,26,23. This fascinating self-organizing system has inspired several studies and models. The original model of Watts and Strogatz 18 explored random restructuring of the links among a finite number of 'nodes'. Barabasi and his colleagues introduced the concept of preferential attachment to model the World Wide Web (WWW) 24,25. The idea has been extended to other types of networks 26 and the focus has been put on the search for general mechanisms underlying the development of these distinct connection systems. 

4.1. Relation of HebbNet to other models 

Although this paper is intended only to show 
some experimental (simulation) results on noise in- 
duced network structures of simplified neuron mod- 
els, the results can be related to other, well-known 
mechanisms, too. In the following we show that 
under some (strong) constraining assumptions, our 
model can be transformed to the model of Barabasi 
et al. 25, the model of preferential attachment. The 
following assumptions are made to enable the above- 
mentioned transition: 

(i) Let us suppose that at t = 0 there are N nodes, from which only n nodes ($n \ll N$) have at least one connection to other nodes. 

(ii) Let the changes in activity and connection strength be discrete by choosing both the weakening and strengthening step of the kernel to be of unit strength. 

(iii) Spikings of the cloud of (N - n) isolated nodes can be considered independent and the spiking probability is small. For such isolated nodes, only the external input, the second part of the right hand side of Eq. 1, counts. Furthermore, the probability of coincident spiking of two isolated neurons is negligibly small if the temporal kernel is short. At any time instant when a neuron of the isolated cloud fires, the nodes of the connected set may fire or not. If no coincidence occurs then there will be no change in the network. However, such coincidences are much more likely given the connectivity structure between the neurons of the connected set. This is so because if one neuron fires then there is a chain of firing amongst these neurons. In turn, the development of new connections between two isolated neurons is not likely, whereas isolated neurons tend to develop new connections toward the connected sub-net. 

(iv) In contrast to the cloud, the activity of the con- 
nected neurons is strongly dependent on the 
spiking activity of the 'neighbors'. If firing 
starts in the connected cloud of neurons then 
the first term of the right hand side of Eq. 1 
will dominate the resulting firing chain. Input 
initiates the firing chain, whereas recurrent excitation from other nodes controls that chain. In 
turn, the probability of firing can be taken as 
(approximately) proportional to the recurrent 
activity, controlled by the incoming connection 
distribution. 

(v) Having established a connection between two 
nodes, it is kept steady and does not change over time. This is a strong assumption, which is 
tacitly assumed by the original model of pref- 
erential attachment, too. 

This latter constraint does not seem to be realistic in any model. There is no reason why old connections should remain steady in a growing connection structure. Note, however, that random rewiring of old connections can give rise to small-world network structure, too. In fact, this rewiring mechanism is the original model of Watts and Strogatz 18. As was noted earlier, our model has an intrinsic weight redistributing property originating from the constraint that $r_{A^+/A^-} < 1$. In 



Emergence of scale-free properties in Hebbian networks 



turn, the incremental growing of the connected sub-net (by connecting new isolated neurons) and the weight redistributing property of HebbNets can be seen as the synthesis of the preferential attachment mechanism with continuous new entries in the model of preferential attachment 25 and the rewiring mechanism of the model of Watts and Strogatz 18. That is, constraining our model leads to a combination of two models, both generating small-world structures. Nonetheless, by means of numerical simulations we have shown that our model can produce such connection structures without the explicit requirement of growth, and without a direct mechanism of weight rewiring. 
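
The constrained limit described by assumptions (i) to (v) can be illustrated by a generic preferential attachment sketch, in which each isolated node joins the connected set by linking to an existing node chosen with probability proportional to that node's current number of connections. This is not the HebbNet simulation itself; the function name, the chain-shaped seed, and the single link per new node are assumptions made for illustration.

```python
import random


def preferential_attachment(n_total, n_seed, rng):
    """Grow a connected set from n_seed (at least 2) to n_total nodes.

    Returns the list of connection counts and the list of edges.
    """
    degree = [0] * n_total
    edges = []
    # Assumption: the initially connected nodes form a simple chain.
    for i in range(n_seed - 1):
        edges.append((i, i + 1))
        degree[i] += 1
        degree[i + 1] += 1
    # Each previously isolated node attaches to one connected node, chosen
    # with probability proportional to its current number of connections.
    for new in range(n_seed, n_total):
        target = rng.choices(range(new), weights=degree[:new])[0]
        edges.append((new, target))
        degree[new] += 1
        degree[target] += 1
    return degree, edges
```

Established edges never change in this sketch, which corresponds to assumption (v); the full model drops that assumption through its weight redistributing property.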

4.2. Remarks on evolutionary systems 

Interestingly, all the listed examples, one way or 
the other, usually are also regarded as evolutionary 
systems. In our particular case, the obtained results 
can also be interpreted in an evolutionary context by 
reconsidering Edelman's alternative neuronal group 
selection theory 30 about the fundamental role of se- 
lection during and after development of the nervous 
system. According to Edelman, a theory to describe 
a system's temporal change can be considered as 'se- 
lectionist', if it includes the following components: 

(i) source of diversification leading to variants, 

(ii) a means for encounter with an environment not 
initially categorized, 

(iii) a means for differential amplification over some 
period of time of those variants in a population 
that have greater adaptive value. 

It is no surprise that a system with these features 
falls into the class of evolutionary systems as far as 
we look at the system as a whole. In the nervous 
systems, there are at least two types of temporal 
changes serving the first requirement: Diversification 
can occur via the emergence of redundant connectiv- 
ity during development and via the modification of 
synaptic efficacy during life-time learning. The main 
thesis of this paper is to demonstrate how diversifi- 
cation can be realized by noise under STDP rules. 
The second requirement is fulfilled if the pool of the 
not yet seen input patterns is not limited.† 

† Considerations about the third requirement are beyond the scope of the present study. 

Now, we can argue that noise in the nervous system has an important role: Noise has no spatiotemporal structure. Thus, obviously, it cannot induce 'learning' in the general sense. However, noise with STDP, according to our results, gives rise to a search mechanism which scans at all scales simultaneously. Search in a scale-free manner can be most efficient if no structural formation is known in advance. The searching feature is robust: the noise-generated structure changes rapidly; the results depicted in the figures are averaged over several runs. The continuous change induced by noise can be interpreted in the following way. The noise together with the proportionally expressed LTD and LTP mechanisms yields a continuous sparsification and regeneration of the connections. LTP 'chooses' sound patterns, whereas LTD helps to 'forget' those patterns and maintains a competition amongst patterns. Synchronous patterns or pattern series are quickly learned by HebbNets and approximately stable connectivity patterns may emerge. Noise, in this case, may modify the connectivity strengths and search may be performed 'around' an average stable connectivity pattern. Also, the noise may help the system to escape from local minima. Noisy Hebbian learning, in turn, is able to simultaneously learn correlations and make selections among the discovered structures or patterns. 

As far as other evolving networks are considered, the profound implication of our result is that local (Hebbian) learning rules may be sufficient to form and maintain an efficient network in terms of information flow. This feature differs from existing models, such as the model of preferential attachment, the global optimization scheme 28, and also from the original Watts and Strogatz model 18. 

In summary, we have demonstrated that small-world architecture with scale-free domains may emerge in sustained networks under the STDP Hebbian learning rule without any other specific constraint on the evolution of the net. According to our results, evolution and plasticity of neural networks may be maintained by noise randomly generated within the central nervous system. We conjecture that the sustained nature of noise and the competition imposed by appropriate $r_{A^+/A^-}$ values are the two relevant components of plasticity and learning. It might be equally important that exponents of HebbNets of neurons with significant interaction are similar in a broad range of parameters, providing a system more stable against homeostatic parameter perturbations. 